Smart Paper Notes

Minorities in networks and algorithms

Fariba Karimi, Marcos Oliveira, Markus Strohmaier

Categories: Artificial Intelligence

2022-06-14

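In this chapter, we provide an overview of data-driven and theory-informed complex models of social networks and their potential for understanding societal inequalities and marginalization. We focus on inequalities that arise from networks and network-based algorithms and on how they affect minorities. In particular, we examine how homophily and mixing biases shape large and small social networks, influence the perception of minorities, and affect collaboration patterns. We also discuss dynamical processes on and of networks and the formation of norms and health inequalities. Additionally, we argue that network modeling is crucial for revealing the effect of ranking and social recommendation algorithms on the visibility of minorities. Finally, we highlight the main challenges and future opportunities in this emerging research topic.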

Translated by Google Translate

Gait Recognition Based on Deep Learning: A Survey

Claudio Filipi Gonçalves dos Santos, Diego de Souza Oliveira, Leandro A. Passos, Rafael Gonçalves Pires, Daniel Felipe Silva Santos, Lucas Pascotti Valem, Thierry P. Moreira, Marcos Cleison S. Santana, Mateus Roder, João Paulo Papa

Categories: Computer Vision | Machine Learning

2022-01-10

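In general, biometrics-based control systems may not rely on each individual's expected behavior or cooperation to operate properly. Instead, such systems should be aware of malicious procedures used in unauthorized access attempts. Some works in the literature suggest addressing the problem through gait recognition approaches. These methods aim to identify people through intrinsic perceptible features, regardless of the clothes or accessories they wear. Although the problem is a relatively long-standing challenge, most techniques developed to handle it have several drawbacks related to feature extraction and low classification rates, among other issues. Recently, however, deep learning approaches have emerged as a robust set of tools for dealing with virtually any image and computer vision problem, and they provide the most significant results for gait recognition as well. This work therefore presents a survey of recent works on biometric detection through gait recognition, with a focus on deep learning approaches, emphasizing their benefits and exposing their weaknesses. It also presents categorized and characterized descriptions of the datasets, approaches, and architectures used to address the associated constraints.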

Least-Squares Linear Dilation-Erosion Regressor Trained using a Convex-Concave Procedure

Angelica Lourenço Oliveira, Marcos Eduardo Valle

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-07-12

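This paper introduces a hybrid morphological neural network for regression tasks called the linear dilation-erosion regressor ($\ell$-DER). An $\ell$-DER is given by a convex combination of the composition of linear and morphological operators. These models yield continuous piecewise linear functions and are therefore universal approximators. Besides introducing the $\ell$-DER model, we formulate its training as a difference of convex (DC) programming problem. Specifically, an $\ell$-DER is trained by minimizing the least-squares error using the convex-concave procedure (CCP). Computational experiments on several regression tasks confirm the efficacy of the proposed regressor, which outperforms other hybrid morphological models and state-of-the-art approaches such as multilayer perceptron networks and radial-basis support vector regressors.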
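
As a rough illustration of the model class named in the abstract above, the sketch below evaluates a convex combination of a dilation-type term (a maximum of affine functions) and an erosion-type term (a minimum of affine functions). The parameterization, the names `affine` and `l_der_predict`, and all numbers are assumptions made for illustration; this is not the paper's exact formulation, and it does not cover training with the convex-concave procedure.

```python
def affine(weights, bias, x):
    """Return the affine value w . x + b for one unit."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def l_der_predict(x, dil_units, ero_units, lam):
    """Hypothetical l-DER-style forward pass (illustrative only).

    dil_units and ero_units are lists of (weights, bias) pairs.
    lam in [0, 1] mixes the two terms as a convex combination.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Maximum of affine functions: convex and piecewise linear.
    dilation = max(affine(w, b, x) for w, b in dil_units)
    # Minimum of affine functions: concave and piecewise linear.
    erosion = min(affine(w, b, x) for w, b in ero_units)
    return lam * dilation + (1.0 - lam) * erosion


# Made-up parameters for a two-dimensional input.
dil_units = [([1.0, 0.0], 0.0), ([0.0, 1.0], 1.0)]
ero_units = [([1.0, 1.0], 0.0), ([2.0, 0.0], -1.0)]
y = l_der_predict([2.0, 3.0], dil_units, ero_units, 0.5)
```

Under these assumptions the output is a convex term plus a concave term, that is, a difference of two convex functions, which is the structure a DC programming formulation exploits.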

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
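
To make the patch-based strategy mentioned in the abstract above concrete, here is a minimal sketch of how an image that is too large to process at once might be tiled into fixed-size patches. The function name `patch_origins`, the border-handling rule, and the sizes are assumptions chosen for illustration; the survey does not prescribe any particular tiling scheme.

```python
def patch_origins(height, width, patch, stride):
    """Return top-left (row, col) corners of square patches covering an image.

    Hypothetical helper: patches are laid out on a regular grid, and one
    extra row or column of patches is added flush with the border when
    the grid alone would leave pixels uncovered.
    """
    if patch > height or patch > width:
        raise ValueError("patch must fit inside the image")
    if stride < 1:
        raise ValueError("stride must be positive")
    rows = list(range(0, height - patch + 1, stride))
    cols = list(range(0, width - patch + 1, stride))
    if rows[-1] != height - patch:
        rows.append(height - patch)  # final row of patches touches the bottom edge
    if cols[-1] != width - patch:
        cols.append(width - patch)   # final column of patches touches the right edge
    return [(r, c) for r in rows for c in cols]


# A 6x6 image covered by 4x4 patches with a stride of 3.
origins = patch_origins(6, 6, 4, 3)
```

Each origin identifies one crop `image[r:r + patch, c:c + patch]` that can be processed independently; how the per-patch predictions are merged afterwards is a separate design choice not shown here.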

SOLD: Sinhala Offensive Language Dataset

Tharindu Ranasinghe, Isuri Anuradha, Damith Premasiri, Kanishka Silva, Hansi Hettiarachchi, Lasitha Uyangodage, Marcos Zampieri

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-01

The widespread presence of offensive content online, such as hate speech and cyber-bullying, is a global phenomenon. This has sparked interest in the artificial intelligence (AI) and natural language processing (NLP) communities, motivating the development of various systems trained to detect potentially harmful content automatically. These systems require annotated datasets to train the machine learning (ML) models. However, with a few notable exceptions, most datasets on this topic have dealt with English and a few other high-resource languages. As a result, the research in offensive language identification has been limited to these languages. This paper addresses this gap by tackling offensive language identification in Sinhala, a low-resource Indo-Aryan language spoken by over 17 million people in Sri Lanka. We introduce the Sinhala Offensive Language Dataset (SOLD) and present multiple experiments on this dataset. SOLD is a manually annotated dataset containing 10,000 posts from Twitter annotated as offensive or not offensive at both sentence-level and token-level, improving the explainability of the ML models. SOLD is the first large publicly available offensive language dataset compiled for Sinhala. We also introduce SemiSOLD, a larger dataset containing more than 145,000 Sinhala tweets, annotated following a semi-supervised approach.

Embedding generation for text classification of Brazilian Portuguese user reviews: from bag-of-words to transformers

Frederico Dias Souza, João Baptista de Oliveira e Souza Filho

Categories: Natural Language Processing | Artificial Intelligence

2022-12-01

Text classification is a natural language processing (NLP) task relevant to many commercial applications, like e-commerce and customer service. Classifying such excerpts accurately is often a challenge, due to intrinsic language aspects, like irony and nuance. To accomplish this task, one must provide a robust numerical representation for documents, a process known as embedding. Embedding is a key NLP field nowadays, having advanced significantly in the last decade, especially after the introduction of the word-to-vector concept and the popularization of Deep Learning models for solving NLP tasks, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer-based Language Models (TLMs). Despite the impressive achievements in this field, the literature on generating embeddings for Brazilian Portuguese texts is scarce, especially when considering commercial user reviews. Therefore, this work aims to provide a comprehensive experimental study of embedding approaches targeting a binary sentiment classification of user reviews in Brazilian Portuguese. This study covers NLP models ranging from classical (Bag-of-Words) to state-of-the-art (Transformer-based). The methods are evaluated on five open-source databases with pre-defined data partitions made available in an open digital repository to encourage reproducibility. The fine-tuned TLMs achieved the best results in all cases, followed by the feature-based TLM, LSTM, and CNN, with alternating ranks depending on the database under analysis.
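
As a small illustration of the classical end of the range described in the abstract above, the following sketch builds a Bag-of-Words count vector for a document. The whitespace tokenization, the helper names `build_vocabulary` and `bag_of_words`, and the example reviews are assumptions made for illustration; the study's actual preprocessing is not described here.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: index for index, tok in enumerate(tokens)}


def bag_of_words(document, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(document.lower().split())
    # Dicts preserve insertion order, so columns follow the sorted vocabulary.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical miniature corpus of user reviews.
reviews = ["produto bom", "produto ruim ruim"]
vocabulary = build_vocabulary(reviews)
vector = bag_of_words("produto ruim ruim", vocabulary)
```

The resulting vector records only how often each vocabulary word occurs, so word order and context are lost; that limitation is one motivation for the learned embeddings the abstract goes on to compare.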

Weakly-supervised detection of AMD-related lesions in color fundus images using explainable deep learning

José Morano, Álvaro S. Hervella, José Rouco, Jorge Novo, José I. Fernández-Vigo, Marcos Ortega

Categories: Computer Vision

2022-12-01

Age-related macular degeneration (AMD) is a degenerative disorder affecting the macula, a key area of the retina for visual acuity. Nowadays, it is the most frequent cause of blindness in developed countries. Although some promising treatments have been developed, their effectiveness is low in advanced stages. This emphasizes the importance of large-scale screening programs. Nevertheless, implementing such programs for AMD is usually unfeasible, since the population at risk is large and the diagnosis is challenging. All this motivates the development of automatic methods. In this sense, several works have achieved positive results for AMD diagnosis using convolutional neural networks (CNNs). However, none incorporates explainability mechanisms, which limits their use in clinical practice. In that regard, we propose an explainable deep learning approach for the diagnosis of AMD via the joint identification of its associated retinal lesions. In our proposal, a CNN is trained end-to-end for the joint task using image-level labels. The provided lesion information is of clinical interest, as it allows the developmental stage of AMD to be assessed. Additionally, the approach allows the diagnosis to be explained from the identified lesions. This is possible thanks to the use of a CNN with a custom setting that links the lesions and the diagnosis. Furthermore, the proposed setting also yields coarse lesion segmentation maps in a weakly-supervised way, further improving the explainability. The training data for the approach can be obtained without much extra work by clinicians. The experiments conducted demonstrate that our approach can identify AMD and its associated lesions satisfactorily, while providing adequate coarse segmentation maps for the most common lesions.

Graph Convolutional Network for Multi-Target Multi-Camera Vehicle Tracking

Elena Luna, Juan Carlos San Miguel, José María Martínez, Marcos Escudero-Viñolo

Categories: Computer Vision

2022-11-28

This letter focuses on the task of Multi-Target Multi-Camera vehicle tracking. We propose to associate single-camera trajectories into multi-camera global trajectories by training a Graph Convolutional Network. Our approach processes all cameras simultaneously, providing a global solution, and it is also robust to large desynchronization between cameras. Furthermore, we design a new loss function to deal with class imbalance. Our proposal outperforms the related work, showing better generalization without requiring ad-hoc manual annotations or thresholds, unlike the compared approaches.

Debiasing Methods for Fairer Neural Models in Vision and Language Research: A Survey

Otávio Parraga, Martin D. More, Christian M. Oliveira, Nathan S. Gavenski, Lucas S. Kupssinskü, Adilson Medronha, Luis V. Moura, Gabriel S. Simões, Rodrigo C. Barros

Categories: Machine Learning | Artificial Intelligence | Natural Language Processing | Computer Vision

2022-11-10

Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their current shortcomings. One of them is that neural networks are correlation machines prone to modeling biases within the data instead of focusing on actual useful causal relationships. This problem is particularly serious in application domains affected by aspects such as race, gender, and age. To prevent models from making unfair decisions, the AI community has concentrated its efforts on correcting algorithmic biases, giving rise to the research area now widely known as fairness in AI. In this survey paper, we provide an in-depth overview of the main debiasing methods for fairness-aware neural networks in the context of vision and language research. We propose a novel taxonomy to better organize the literature on debiasing methods for fairness, and we discuss the current challenges, trends, and important future work directions for the interested researcher and practitioner.

Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report

Andrey Ignatov, Grigory Malivenko, Radu Timofte, Lukasz Treszczotko, Xin Chang, Piotr Ksiazek, Michal Lopuszynski, Maciej Pioro, Rafal Rudnicki, Maciej Smyl

Categories: Computer Vision

2022-11-07

Various depth estimation models are now widely used on many mobile and IoT devices for image segmentation, bokeh effect rendering, object tracking, and many other mobile tasks. Thus, it is crucial to have efficient and accurate depth estimation models that can run fast on low-power mobile chipsets. In this Mobile AI challenge, the target was to develop deep learning-based single-image depth estimation solutions that can achieve real-time performance on IoT platforms and smartphones. For this, the participants used a large-scale RGB-to-depth dataset that was collected with the ZED stereo camera, which is capable of generating depth maps for objects located at up to 50 meters. The runtime of all models was evaluated on the Raspberry Pi 4 platform, where the developed solutions were able to generate VGA resolution depth maps at up to 27 FPS while achieving high-fidelity results. All models developed in the challenge are also compatible with any Android or Linux-based mobile device; their detailed descriptions are provided in this paper.
